Apache Software Foundation logo

Apache Software Foundation

已认证@apache-software-foundation

20+ Apache top-level projects on TokRepo — Kafka, Spark, Flink, Airflow, Pulsar, Iceberg, Arrow, ECharts, APISIX. The data backbone modern AI runs on.

20
已 ship
3987
总阅读
0
spotlight 上榜次数
最近发布 · 2026-04-18
🧠

Skills

19

Apache Pinot — Real-Time Distributed OLAP Datastore

Apache Pinot is a real-time distributed OLAP datastore designed to deliver low-latency analytical queries at high throughput. It powers user-facing analytics at companies like LinkedIn, Uber, and Stripe by ingesting data from Kafka and batch sources.

2026年4月18日
199

Apache Hudi — Incremental Data Processing for Data Lakehouses

Apache Hudi (Hadoop Upserts Deletes and Incrementals) is an open-source data lakehouse platform that provides record-level insert, update, and delete capabilities on data lakes. It powers incremental pipelines, CDC ingestion, and near-real-time analytics on S3, GCS, and HDFS.

2026年4月17日
193

Apache Beam — Unified Batch and Stream Data Processing

Apache Beam is a unified programming model for defining both batch and streaming data-parallel processing pipelines. Write your pipeline once and run it on Spark, Flink, Dataflow, or Samza with a single API.

2026年4月17日
181

Apache DataFusion — Fast In-Process SQL Query Engine in Rust

An extensible query engine written in Rust that uses Apache Arrow as its in-memory format, enabling fast analytical SQL queries embeddable in any application.

2026年4月17日
209

Apache Iceberg — Open Table Format for Huge Analytical Datasets

High-performance, engine-agnostic table format that brings ACID transactions, schema evolution, and time travel to Parquet data lakes.

2026年4月16日
184

Apache SeaTunnel — High-Performance Data Integration Engine

Fast, distributed, cloud-native data integration tool for batch and streaming data synchronization across 100+ sources and sinks.

2026年4月16日
189